药物误解是可能导致对患者造成不可预测后果的风险之一。为了减轻这种风险,我们开发了一个自动系统,该系统可以正确识别移动图像中的药丸的处方。具体来说,我们定义了所谓的药丸匹配任务,该任务试图匹配处方药中药丸所拍摄的药丸的图像。然后,我们提出了PIMA,这是一种使用图神经网络(GNN)和对比度学习来解决目标问题的新方法。特别是,GNN用于学习处方中文本框之间的空间相关性,从而突出显示带有药丸名称的文本框。此外,采用对比度学习来促进药丸名称的文本表示与药丸图像的视觉表示之间的跨模式相似性的建模。我们进行了广泛的实验,并证明PIMA在我们构建的药丸和处方图像的现实数据集上优于基线模型。具体而言,与其他基线相比,PIMA的准确性从19.09%提高到46.95%。我们认为,我们的工作可以为建立新的临床应用并改善药物安全和患者护理提供新的机会。
translated by 谷歌翻译
单个图像超分辨率(SISR)是一个非常活跃的研究领域。本文通过使用带有双鉴别器的GaN的方法来解决SISR,并将其与注意机制合并。实验结果表明,与其他传统方法相比,GDCA可以产生更尖锐和高令人愉悦的图像。
translated by 谷歌翻译
自动驾驶汽车(AV)必须在动态环境中安全有效地操作。为此,配备联合雷达通信(JRC)功能的AVS可以通过使用雷达检测和数据通信功能来增强驾驶安全性。但是,在不确定性和周围环境的动态下,通过两种不同功能优化AV系统的性能非常具有挑战性。在这项工作中,我们首先提出一个基于马尔可夫决策过程(MDP)的智能优化框架,以帮助AV在周围环境的动态和不确定性下选择JRC操作功能时做出最佳决策。然后,我们开发了一种有效的学习算法,利用了深度强化学习技术的最新进展,以找到AV的最佳政策,而无需任何有关周围环境的先前信息。此外,为了使我们提出的框架更加可扩展,我们开发了一种转移学习(TL)机制,该机制使AV能够利用有价值的体验来加速培训过程,以加速培训过程。广泛的模拟表明,与其他常规的深钢筋学习方法相比,提议的可转移深钢筋学习框架可将AV的障碍检测概率降低到67%。
translated by 谷歌翻译
Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve the generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. We then propose a new method to improve Mixup based on the novel insight. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves Mixup across various datasets using a variety of architectures, for instance, exhibiting an improvement over Mixup by 0.8% in ImageNet top-1 accuracy.
translated by 谷歌翻译
In this work, we propose a new approach that combines data from multiple sensors for reliable obstacle avoidance. The sensors include two depth cameras and a LiDAR arranged so that they can capture the whole 3D area in front of the robot and a 2D slide around it. To fuse the data from these sensors, we first use an external camera as a reference to combine data from two depth cameras. A projection technique is then introduced to convert the 3D point cloud data of the cameras to its 2D correspondence. An obstacle avoidance algorithm is then developed based on the dynamic window approach. A number of experiments have been conducted to evaluate our proposed approach. The results show that the robot can effectively avoid static and dynamic obstacles of different shapes and sizes in different environments.
translated by 谷歌翻译
We introduce an approach for the answer-aware question generation problem. Instead of only relying on the capability of strong pre-trained language models, we observe that the information of answers and questions can be found in some relevant sentences in the context. Based on that, we design a model which includes two modules: a selector and a generator. The selector forces the model to more focus on relevant sentences regarding an answer to provide implicit local information. The generator generates questions by implicitly combining local information from the selector and global information from the whole context encoded by the encoder. The model is trained jointly to take advantage of latent interactions between the two modules. Experimental results on two benchmark datasets show that our model is better than strong pre-trained models for the question generation task. The code is also available (shorturl.at/lV567).
translated by 谷歌翻译
Machine Learning as a service (MLaaS) permits resource-limited clients to access powerful data analytics services ubiquitously. Despite its merits, MLaaS poses significant concerns regarding the integrity of delegated computation and the privacy of the server's model parameters. To address this issue, Zhang et al. (CCS'20) initiated the study of zero-knowledge Machine Learning (zkML). Few zkML schemes have been proposed afterward; however, they focus on sole ML classification algorithms that may not offer satisfactory accuracy or require large-scale training data and model parameters, which may not be desirable for some applications. We propose ezDPS, a new efficient and zero-knowledge ML inference scheme. Unlike prior works, ezDPS is a zkML pipeline in which the data is processed in multiple stages for high accuracy. Each stage of ezDPS is harnessed with an established ML algorithm that is shown to be effective in various applications, including Discrete Wavelet Transformation, Principal Components Analysis, and Support Vector Machine. We design new gadgets to prove ML operations effectively. We fully implemented ezDPS and assessed its performance on real datasets. Experimental results showed that ezDPS achieves one-to-three orders of magnitude more efficient than the generic circuit-based approach in all metrics while maintaining more desirable accuracy than single ML classification approaches.
translated by 谷歌翻译
Inverse medium scattering solvers generally reconstruct a single solution without an associated measure of uncertainty. This is true both for the classical iterative solvers and for the emerging deep learning methods. But ill-posedness and noise can make this single estimate inaccurate or misleading. While deep networks such as conditional normalizing flows can be used to sample posteriors in inverse problems, they often yield low-quality samples and uncertainty estimates. In this paper, we propose U-Flow, a Bayesian U-Net based on conditional normalizing flows, which generates high-quality posterior samples and estimates physically-meaningful uncertainty. We show that the proposed model significantly outperforms the recent normalizing flows in terms of posterior sample quality while having comparable performance with the U-Net in point estimation.
translated by 谷歌翻译
Collecting large-scale medical datasets with fully annotated samples for training of deep networks is prohibitively expensive, especially for 3D volume data. Recent breakthroughs in self-supervised learning (SSL) offer the ability to overcome the lack of labeled training samples by learning feature representations from unlabeled data. However, most current SSL techniques in the medical field have been designed for either 2D images or 3D volumes. In practice, this restricts the capability to fully leverage unlabeled data from numerous sources, which may include both 2D and 3D data. Additionally, the use of these pre-trained networks is constrained to downstream tasks with compatible data dimensions. In this paper, we propose a novel framework for unsupervised joint learning on 2D and 3D data modalities. Given a set of 2D images or 2D slices extracted from 3D volumes, we construct an SSL task based on a 2D contrastive clustering problem for distinct classes. The 3D volumes are exploited by computing vectored embedding at each slice and then assembling a holistic feature through deformable self-attention mechanisms in Transformer, allowing incorporating long-range dependencies between slices inside 3D volumes. These holistic features are further utilized to define a novel 3D clustering agreement-based SSL task and masking embedding prediction inspired by pre-trained language models. Experiments on downstream tasks, such as 3D brain segmentation, lung nodule detection, 3D heart structures segmentation, and abnormal chest X-ray detection, demonstrate the effectiveness of our joint 2D and 3D SSL approach. We improve plain 2D Deep-ClusterV2 and SwAV by a significant margin and also surpass various modern 2D and 3D SSL approaches.
translated by 谷歌翻译
Air pollution is an emerging problem that needs to be solved especially in developed and developing countries. In Vietnam, air pollution is also a concerning issue in big cities such as Hanoi and Ho Chi Minh cities where air pollution comes mostly from vehicles such as cars and motorbikes. In order to tackle the problem, the paper focuses on developing a solution that can estimate the emitted PM2.5 pollutants by counting the number of vehicles in the traffic. We first investigated among the recent object detection models and developed our own traffic surveillance system. The observed traffic density showed a similar trend to the measured PM2.5 with a certain lagging in time, suggesting a relation between traffic density and PM2.5. We further express this relationship with a mathematical model which can estimate the PM2.5 value based on the observed traffic density. The estimated result showed a great correlation with the measured PM2.5 plots in the urban area context.
translated by 谷歌翻译